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In this work we present a theoretical analysis of the convergence of the Wang-Landau algorithm 
[Phys. Rev. Lett. 86, 2050 (2001)] which was introduced years ago to calculate the density of states 
in statistical models. We study the dynamical behavior of the error in the calculation of the density 
of states. We conclude that the source of the saturation of the error is due to the decreasing variations 
of the refinement parameter. To overcome this limitation, we present an analytical treatment in 
which the refinement parameter is scaled down as a power law instead of exponentially. An 
extension of the analysis to the AM old way variation of the method is also discussed. 


I. INTRODUCTION 

Monte Carlo methods are well suited for the simulation 
of large many-body problems, particularly for the study of 
phase transitions and critical phenomena. 1 ' 2 

Many conventional Monte Carlo algorithms such as Me- 
tropolis importance sampling, Swendsen-Wang cluster 
flipping, 4 etc., generate a canonical distribution at a given 
temperature. Such distributions are so narrow that, with con- 
ventional Monte Carlo simulations, multiple runs are re- 
quired to determine thermodynamic quantities over a signifi- 
cant range of temperatures. 

For a second-order phase transition in unfrustrated sys- 
tems, the problem of critical slowing down was solved by a 
cluster update algorithm. 4 For first-order phase transitions 
and in systems with many free energy minima such as frus- 
trated magnets or spin glasses, a similar problem of long 
tunneling times between local minima arises. 

Several methods were developed to overcome this prob- 
lem, including the multicanonical method, 5-14 and related en- 
tropic sampling method, 15 broad histogram method, 16-18 
multibonding simulations, 19 etc. 

The multicanonical ensemble method proposed by Berg 
et a!.' 7 estimates the density of states (DOS) g(E) (the num- 
ber of all possible states or configurations for an energy level 
E of system) first, then performs a random walk with a flat 
histogram in the desired region in the phase space. This 
method has been proven as very efficient in studying the 
first-order phase transitions, where simple canonical simula- 
tions have difficulty in overcoming the tunneling barrier be- 
tween coexisting phases at the transition temperature. 5 ' 8-14 

The entropic sampling method, which is basically 
equivalent to multicanonical ensemble sampling, 15 is in itera- 
tive process used to calculate the microcanonical entropy de- 
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fined as 5 , (£’)=ln[g(£)]. The method was applied to the two- 
dimensional (2D) ten-sate (Q= 10) Potts model and the three- 
dimensional (3D) Ising model. 15 

The broad histogram method introduced by de Oliveira 
et c//. 16-is calculates the density of states by estimating the 
probabilities of possible transitions between all possible 
states of a random walk in energy space. However, the 
method presents systematic errors even for simple models 
such as the Ising model at small system sizes. 18 In fact, those 
methods based on accumulation of histogram entries have 
the problem of scalability for large systems, suffering from 
systematic errors when systems are large. 20 

Wang and Landau 20 introduced a new and efficient algo- 
rithm using a random walk in energy space to obtain an 
estimation of the density of states for statistical models. The 
method has been successfully used in many problems of sta- 
tistical physics, biophysics, and others. 20- " 4 The method is 
based on independent random walks which are performed 
over adjacent overlapped energy regions, providing the den- 
sity of states. In that way, thermodynamic observables, in- 
cluding the free energy over a wide range of temperature, 
can be calculated with one single simulation. 

Since Wang and Landau introduced the multiple-range 
random walk algorithm to calculate the density of states, 
there have been numerous improvements proposed 29-38 and 
studies of the efficiency and convergence of this 
method. ’ ’ Particularly, Zhou and Batt " have presented a 
mathematical analysis of the Wang-Landau (WL) algorithm. 
They give a proof of the convergence and the sources of 
errors for the WL algorithm and the strategies for improve- 
ment. The main conclusions of their work are (i) the density 
of state is encoded in the average histogra m; (i i) the fluctua- 
tions of the histogram, proportional to 1 / \ In /, cause statis- 
tical errors, which can be reduced by averaging over multiple 
g(E)\ (iii) the correlation between adjacent records in the 
histogram introduces a systematic error, which is reduced at 
small /. 

Earl and Deem in Ref. 38 have derived a general condi- 


tion for the convergence of a Monte Carlo method whose 
history dependence is contained within the simulated density 
distribution. The authors concluded that (i) the detail balance 
needs only be satisfied asymptotically and (ii) the calculated 
density of states approaches to the real one with an error 
proportional to 1 It. 

On the other hand, several limitations of the method re- 
main still unsolved, as, for example, the behavior of the tun- 
neling time which is a bound for the performance of any 
flat-histogram algorithm, as is discussed in Ref. 36, where it 
is shown that it limits the convergence in the WL algorithm. 

Berg have proposed a multicanonical simulation scheme 
where the first part is the WL algorithm and the second part 
is a standard Markov chain Monte Carlo simulation for 
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which convergence is proven/ 

Berg and Janke have introduced a cluster version of the 
Wang-Landau algorithm together with a subsequent multi- 
bondic simulation. This method improves the efficiency of 
the conventional WL or multicanonical approach by power 
laws in the lattice size for systems such as 2D and 3D Ising 
models. 19 

Although several authors say that the WL algorithm 
converge, 32 ' 38 one of the limitations of the method and the 
subsequent improved algorithms based on it is the saturation 
in the error. 

The saturation in the error and therefore the nonconver- 
gence of the WL method were also suggested by Landau and 
co-workers already in Refs. 37 and 40. Particularly the im- 
provement of the A- fold way variation introduced by 
Malakis et al , 41 does not avoid the saturation in the error. 

In fact, those methods in which the refinement parameter 
vary lower faster than 1 It (with t the Monte Carlo time) 
determine that the calculated density of states reaches a con- 
stant value for long times, therefore the error saturates. To 
overcome this limitation, we recently introduced a modified 
algorithm in which the refinement parameter is scaled down 
as l/t instead of exponentially. 4- This new algorithm allows 
the calculation of the density of states faster and more accu- 
rately than with the original WL algorithm due to the fact 
that the calculated density of state function approaches as- 
ymptotically the exact value. 

In this work we present a theoretical demonstration of 
the saturation in the error in the WL algorithm and deduce 
analytically the optimum choice of the refinement parameter 
F as F(t) = l/t, without using the histogram flatness condi- 
tion as was introduced in Ref. 42. The resulting algorithm is 
more efficient than the original WL method and other varia- 
tions like those obtained by using A-fold way algorithm. 30 ' 41 
Due that the exact density of state is, in general, rarely 
known, we implement all the calculations in a two- 
dimensional LXL Ising model, where g(E) is exactly known 
for size up to L= 50. 4j 

The outline of the paper is as follows. In Sec. II, we 
introduce the time behavior of the Wang-Landau algorithm, 
demonstrating the origin of the saturation of the error. In Sec. 
Ill, we give the analytical bases of the l/t algorithm intro- 
duced previously in Ref. 42. In Sec. IV, we discuss the dy- 
namical behavior of the A-fold way version of the WL algo- 


rithm, introducing the corresponding 1 It modification of our 
algorithm using such method. Finally, in Sec. V, we discuss 
the results and give our conclusions. 


II. THEORETICAL ANALYSIS OF THE CONVERGENCE 
OF THE WANG-LANDAU ALGORITHM 


In the original Wang-Landau algorithm, 20 an initial en- 
ergy range of interest is identified, £ min ^ £ max , and a 
random walk is performed in this range. During the random 
walk, two histograms are updated: one for the density of 
states g(E), which represents the current or running estimate, 
and one for number of visits to distinct energy states H(E). 
Before the simulation begins, H(E) is set to zero and g(E) is 
set to unity, both uniformly. The random walk is performed 
by choosing an initial state i in the energy range E min ^E 
*s£ max . Trial moves are then attempted and moves are ac- 
cepted according to the transition probability 


p(Ei 


Ef ) = min 


, g(Ej) 

’g(£y)_r 


( 1 ) 


where Ej(Ej) are the initial (final) states, respectively, and 
p(£,-— >£y) is the transition probability from the energy level 
£,■ to Ef. Whenever a trial move is accepted, a histogram 
entry corresponding to state n is incremented according to 
H(E n )=H(E n )+l and g(E„)=g(E„)Xf, where / is conver- 
gence factor, that is, generally initialized as e. If a move is 
rejected, the new configuration is discarded and the histo- 
gram entry corresponding to the old configuration is incre- 
mented according to H(E 0 )=H(E 0 )+ 1; at the same time the 
density of states is incremented according to g(E 0 )=g(E 0 ) 
X/. This process is repeated until the energy histogram be- 
comes sufficiently flat. When that happens, the energy histo- 
gram H is reset to zero everywhere and the convergence 
factor is decreased, usually according to f k+ \ = 'lf k , where f k 
is the convergence factor corresponding to stage k. The pro- 
cess is continued until / becomes sufficiently close to 1 [say, 
/<ex p(10“ 8 )]. 

To analyze the dynamical behavior of the WL algorithm, 
it is necessary to define different quantities. The histogram 
H'(E,t) is defined in an analogous way as the histogram H 
used in the WL algorithm, but is not reset in the whole simu- 
lation. Its mean value after time t is defined as 


{H'(t)) = ^y j H'{E,t), 

N E 


( 2 ) 


where N is the number of states of different energies and 
H'(E,t) stands for the mean height of the histogram in E at 
time t (with t=j/N, where j is the number of trial moves 
attempted). The time t is related to the numbers of accessible 
energy states. For example, in a two-dimensional Ising 
model the time t is given by t= j/(L 2 - 1), where L is the size 
of the system (if we consider only one energy window to 
calculate the DOS). 

Instead of the density of states it is more convenient to 
define S(E,t) = ln(g(E,t)), usually named entropy. This defi- 
nition is generally used in order to fit all possible values of 
g(E) into double precision numbers. Its mean value is de- 
fined as 


( 10 ) 


(S(t))=^2s(E,t). 

N E 

The error e(E,t) is defined as 


e(E,t) = 


1 - 


In [g n {E,t)\ 


In [g e x(£)] 
and its mean value as 

1 


S sx {E) - S n (E,t) 


S ex (E) 


<*«> = 


(»-D? e(£ '' ) - 


(3) 


(4) 


(5) 


g n (E,t) is normalized with respect to the exact DOS at the 
ground state, that is. 


= g(E,t)g ex (E c ) 
g(E c ,t ) 


(6) 


where g(E,t), g ex (E c ), and g(E G ,t) are the simulated value, 
the exact values at the ground state, and the simulated value 
at the ground state of the DOS, respectively. Therefore, 

S n (E, t ) = In [g n (E, f)] = S(E, t ) - S(E g , t ) + S sx (E c ) . (7) 

It is also convenient to define the function F=ln[/], then 
F k+ i=F k lm (with m> 1). 

Note that F k is always positive and monotonically de- 
creasing. 

It is necessary to emphasize that Eq. (6) breaks down for 
any classical continuous system, for which the entropy di- 
verges both at very low and very high energies, therefore 
other normalization schemes must be introduced in order to 
avoid the divergence of the entropy. For instance, choosing 
other reference state with energy E' in such a way that 
gexCE')* 0. 

Then S(E,t) can be written as 

t 

S(E,t) = 2 [H'(E,i) - H'(E,i - l)]F M , (8) 

i=i 


where F 0 =const [usually, F 0 =log(e) =0.434 294 4], 

The flatness condition of the histogram H(E,t) and Eq. 
(1) guarantee that F i+] takes the value F, a finite number of 
times before eventually decreasing to F,//m. Moreover 
H'{E,i)-H'(E,i- 1) is finite. Since 'E J k=] F k is convergent, 
the series S(E,t) is also convergent for any value of m. 

Let us rewrite the error given in Eq. (4) as 


e{E,t) = 


\AS{E,t)-AS ex [E,t)\ 

S ex (E) 


(9) 


where A S(E,t) = S(E,t)-S(E c ,t) and AS ex (E)=S ex (E) 
-S ex (E c ). Since S(E,t) and S(E G ,t) are convergent, 
AS(E,t )^ const when r— ><». However, due to the stochastic 
character of the Markov process, both quantities are not sta- 
tistically equal, AS(E ,t) =F AS ex (E), then the mean value of 
the error goes to a constant limit as time increases to infinity. 

Let us show, as an example, the calculation of the error 
for the 2D Ising model at the boundaries of the energy range, 
E=2L 2 . Note that the ground state energy is E=-2L 2 for this 
model. It is well known that 5 , ex (-2L 2 )=S ex (-2L 2 ) = ln(2), 
hence 


S(2L 2 ,t)-S(- 2 L 2 ,t) 

S ex (2L 2 ) 

Then, using Eq. (8) and the argument of convergence, 
one can conclude that S(2L 2 ,t) ^ K(2L 2 ) and S(-2L? j) 
— >F(-2F 2 ) when f— >°°, where K(2L?) and K(-2L 2 ) are con- 
stant. Due to the stochastic character of the process, these 
quantities are not statistically equal, therefore e(2 L 2 ,t) satu- 
rates, and the mean value of the error (e) saturates as well. 
Similar arguments can be used to demonstrate the saturation 
of the error for each energy level. In the present case of the 
2D Ising model, the exact value of the density of states 
g ex (E) was obtained from the method developed in Ref. 43. 

In conclusion, the WL algorithm does not converge to 
the true value of the DOS because the series given in Eq. (8) 
is convergent. 


e{2 L 2 ,t) = 


III. AVOIDING THE SATURATION OF THE ERROR: 

THE 1/f ALGORITHM 

In order to avoid the saturation of the error, the series 
given in Eq. (8) must be made divergent, with F k monotoni- 
cally decreasing and F k tending to zero as fast as possible. 
The last condition is not arbitrary since it guarantees that 
g(E,t)^g ex (E) rapidly, reducing the computational time. 

The series which fulfills such conditions is the harmonic 
series S£L 0 1 IkP with f I , the fast converging being the one 
with p = 1 . 

In order to make the WL algorithm more efficient, the 
refinement parameter must take, for long times, the follow- 
ing time functionality: 

F(t) = - t , ( 11 ) 

which is updated at each Monte Carlo step. 

Initially, F 0 = 1 , then at short time, and as was proposed 
previously, 42 we choose F(t)> 1 It (i.e., F i+l =Fj/2 as in the 
original WL algorithm) in such a way that the algorithm 
visits all the energy configurations as fast as possible. As 
soon as 1 It, that is at time t c , the algorithm changes and 
the refinement parameter takes the time functionality de- 
scribed above. Then, using Eq. (8), the entropy takes the 
form 


S(E,t)= 2 [H'(E,k)-H'(E,k- 1)]F* 


+ 2 [H'(E,k)-H'{E,k- 1)]-, 

z — i k 


(12) 


for t^t c , F k+l =F k /2, or F k+1 =F k , 

From the definition of Eq. (3), and replacing the expres- 
sion of S(E,t ) given in Eq. (12), we obtain 



FIG. 1. (a) Comparison between the error calculated using the original 
WL method with different flatness conditions and the error obtained using 
our 1/f algorithm, for a two-dimensional Ising model with nearest-neighbor 
lateral interaction in squares lattice with L- 8. In the inset ( S(t )) is shown vs 
time t, in semilog plot, to emphasize its logarithmic behavior, (b) Behavior 
of the error as a function of time, for different values of the parameter m for 
a flatness condition of 95%. 


<■ S(t )) = ^2 2 - H'(E,k - l)]F k 

N E k= i 

+ 2[H'(E,k)-H'(E,k- 1)]-. (13) 

k=t c k 

We can rearrange the last equation, summing first on the 
energy range, then considering that on average ( 

- 1))~ 1, to finally obtain 

(S(t)) * In (t). (14) 

This quantity is necessary to show the dynamics of the algo- 
rithm and not to obtain the observable which are calculated 
using the expression of the entropy given in Eq. (12). 

In Fig. 1(a), we show a comparison between the error 
(e(t)) calculated using the WL method with different flatness 
conditions and the error obtained using our algorithm, for the 
Ising model in a 2D squares lattice with L = 8 and F k+ 1 
=F k /m (with m= 2). In the inset, the behavior of the ( S(t )) is 
shown in semilog plot for our algorithm. As one can see, 
after the initial time t c the function ( S ) takes the form given 
in Eq. (14). In Fig. 1(b) we show the behavior of the error as 
a function of time, for different values of the parameter m 
and flatness condition of 95%. The accuracy of the WL 
method decreases as m increases. 

Note that the error, in our algorithm, is proportional to 
1 / \ / instead of 1 It, as is predicted by Earl and Deem in Ref. 
38. 

Note that instead of Eq. (11), we can propose a more 
general time dependence for the refinement parameter as 

F(.t)=°-. (15) 

However, a simple observation of the behavior of F(t) as a 
function of p and c and the corresponding effect on the error 
(e(t)) show that the best choice to optimize the efficiency of 


FIG. 2. (a) Dynamical behavior of F(t ) for different values of the exponent 
p and c = 1. (b) Error (e(t)) vs time t, for the same values of p and c= 1 [see 
Eq. (15)]. 

the algorithm is that for p= 1 and c=l. In fact, in Fig. 2(a) 
we show the function F(t) as a function of t for different 
values of the exponent p with c- I . The effect on the error 
( e( t )) is shown in Fig. 2(b). For 1, the error goes as (e) 
« \jF(t) for long time (asymptotic regime), one also observes 
that the value which optimizes the error is p=l. As is dis- 
cussed above for p> 1, the error saturates. In Figs. 3(a) and 
3(b) we show the function F(t ) and the corresponding error 
(e(f)), as a function of t for p= 1 and three different values of 
c(c=0.1 , 1 , 10). The error goes as (e{t))^ 4c/~t for 1 and 
changes the time behavior for c < 1 . In fact, the error follows 
a power law dependence with an exponent bigger than -0.5. 
However, in all cases the error does not saturate. 

In Fig. 4 we show the error ( e(t )) versus the refinement 
parameter F calculated using the original WL algorithm, for 
different flatness conditions. 

Note that the behavior of the error follows the law \F 
before the error saturates, confirming the conclusion (ii) pre- 
dicted in Ref. 32. On the other hand, no matter how large the 
flatness condition is, the error saturates in all cases for 
smaller values of F, demonstrating that the calculated DOS 
does not converge to the exact value; this fact is in contra- 
diction with the third conclusion predicted by Zhou and Batt 
in Ref. 32. 



t 


FIG. 3. (a) Dynamical behavior of F(t) for different values of c and p- 1. 
(b) Error (e(t)) vs time t, for the same values of c and p= 1 [see Eq. (15)]. 



fore, one has for the probability that exactly n random num- 
bers will result in a new configuration, 

P n = P(l -P) n ~ l . (19) 

Thus the average lifetime becomes 

oo oo 2 

T = 2 nP n = 2 nP„( 1 - P)' l ~ l = (20) 

n= 1 n= 1 Um 


FIG. 4. Error (e(r)) vs the refinement parameter, calculated using the origi- 
nal WL method, for different values of the flatness condition (from top to 
bottom, 70%, 80%, 90%, 95%, 97%, and 99%). 


IV. ON THE 1/f ALGORITHM USING THE A/-FOLD WAY 

In this section we discuss the dynamical behavior of the 
/V- fold version of the WL algorithm. 30 ' 41 in order to show 
that also in this case the error saturates. 

To overcome such a problem we introduce a new version 
of our method, the A-fold way 1 It algorithm. As in original 
version of the WL /V-fold way algorithm, 30 we assume a spin 
system which may be in state a with energy Eel 
= [£min>£max]> whereby I denotes the energy range for which 
we wish to estimate g(E). All spins are then partitioned into 
classes according to their energetic local environment, i.e., 
the energy difference A £, a spin flip will cause. For the spe- 
cial case of the two-dimensional nearest-neighbor Ising 
model, each spin belongs to one of only M= 10 classes. The 
total probability P of any spin of class i being flipped is 
given by 

E(A£ ; ) = n((r,AE t )p(E -^ £ + A£,-), f = 1, ...,Af, (16) 

with n(cr,A£,) being the number of spins of state a which 
belong to class i and p(£^£+A£,) being given by 

p(E->E+AE ; ) 

I min( 1 ,g(E)/g(E + A £,-)) , if E + A £ ; e I 
~[o, if E + AE, $ I. 

(17) 


Based on these definitions, we can describe the /V-fold 
way version of the 1 / ^-algorithm as follows. 

(i) Choose an initial configuration and set H(E) = 0, 
5(E) = 0 for all £: t= 0, E 0 =log(e) = 0.434 294 4 and 
also fix E fin „| . 

(ii) Determine (update) the probabilities p(£^£+A£,-) 
and the Q m ’ .v of the (initial) configuration using Eqs. 
(16) — (18). 

(iii) Determine the average lifetime r of (initial) configu- 
ration via Eq. (20). 

(iv) Increment histogram, density of states, the time t, and 
update /,, 

H(E) -> H(E) + 1 , 

5(£) — > 5(£) + A5(£) , 
t—*t + t/N, 


with 

A5(£) = < 
and 

F i+ 1 


£,t if £,t=S £ () 
,£ 0 if £,t> E 0 , 


£, if F,r^F 0 
A 5(£)/r if F,t>F 0 . 

In the case where £,■/ F i+l > 2, we set 


( 21 ) 


( 22 ) 


To determine the class from which to flip a spin, one calcu- 
lates the numbers 

Qm = 2 m£,), m=l, ... ,M and 0 o = O, (18) 

i^m 

which are the integrated probabilities for a spin flip within 
the first m classes. Hence a class is selected by generating a 
random number RND such that 0 < RND*S Q M . and taking 
class m if 0„,-i < RND “S Q ln . The spin to be flipped is cho- 
sen from this class with equal probabilities. The numbers 
«(tr,A£,) change upon flipping the spin. Due to the flip, the 
spin and its interacting neighbors will change classes and 
correspondingly the numbers «(cr,A£,) will differ from their 
predecessors. Finally, one has to determine the average life- 
time t of the resulting state, i.e., one has to find out how 
many times the move just made would be rejected on aver- 
age in the usual update scheme. The probability that the first 
random number would produce a flip is P=Q M /L 2 . There- 


F j- +1 — > £ ,72 . (23) 

(v) After some fixed sweeps check H(E)\ if H ( E) 
=£0 V£ then refine £ ; according to F i+ 1 = Fj/2 and 
H(E) is reset. 

(vi) If F i+1 ^l/t then F i+l =F=l/t. Then H(E) and Eqs. 
(22) and (23) are not used in the rest of the simula- 
tion, while from Eq. (21) we only use A5 (£)=£,t. 

(vii) If £<£fj na i the simulation stops. 

(viii) Determine the Q m ’ s [the p(E — ■ E+ A £)■) ’ v are not up- 
dated here]. 

(ix) Choose and flip spin as described above. 

(x) Go to (ii). 

The initial value of the refinement parameter E 0 has a 
crucial importance at the beginning of the algorithm and can 
determine the subsequent dynamics. In fact, we observe that 
choosing £ 0 small, the algorithm wastes a long time visiting 
the most accessible states. On the contrary, if E 0 is large 



FIG. 5. (a) Refinement parameter F(l) vs time and (b) dynamical behavior 
of the error for the four methods mentioned in the text. 

enough, the algorithm wastes a long time visiting the least 
accessible states. The more efficient election is thus F 0 
— 0.5. 

As in the algorithm introduced in Ref. 30, A S{E) is kept 
below or equal F 0 , avoiding that Q M = 0 which would termi- 
nate the iteration procedure described above. The time t is 
the accumulate average lifetime used as a refinement param- 
eter in our algorithm, similarly to 1/f in the algorithm intro- 
duced previously in Ref. 42. 

In Fig. 5(b), we have plotted the dynamical behavior of 
the average error ( e(f )) as a function of the Monte Carlo 
time, both the original and the N - fold way version of the WL 
algorithm saturate, while our 1 It version and the correspond- 
ing /V-fold way variation do not. A comparison of these 
curves demonstrates that our methods are in both cases more 
efficient than the original algorithms. In Fig. 5(a), we have 
plotted the corresponding refinement parameter for each al- 
gorithm. As one observes, lit behaves as limiting curves, as 
soon as the refinement parameter for the WL algorithm in 
both the original and the IV- fold way versions is lower than 
this limiting curve, the corresponding errors saturate. 

Clearly, the error saturates also in the AM old way varia- 
tion of the WL algorithm. The reason of such behavior was 
explained in Sec. II of the present paper. In fact, on the 
average F i r i >F i+1 T i+ \ and due to that S(E,t ) can be written 
as a series, with a kernel Fj, then the series that it represents 
this function is convergent. Using the same argument, to 
avoid the saturation in the error, we have to choose the re- 
finement parameter as F=l/t. 

V. DISCUSSION AND CONCLUSIONS 

In this work, we discussed one of the weaknesses as of 
the well known Wang-Landau algorithm, namely, the satu- 
ration of the error or the nonconvergence of calculated den- 
sity of states to the exact value. We presented an analytical 
demonstration of the nonconvergence to the exact DOS in 
the original version of the WL algorithm. We have also 
shown that the saturation in the error appears not only in the 
original version of the WL algorithm but in the N- fold way 
variation of such method. 30 Alternatively, we deduced ana- 
lytically the way to avoid the saturation of the error and gave 


an adequate form to the refinement parameter. This new al- 
gorithm, the so-called lit algorithm already introduced by us 
in Ref. 42, was then extended within the AMold way scheme. 

The nonconvergence of the original WL algorithm and 
other previous version, including A'- fold way method, 
seemed very difficult to believe. However, in view of our 
results we are able to discuss some of these statements raised 
by other authors. 32 ' 8 Particularly, some of the conclusions 
presented by Zhou and Batt 32 are not satisfied completely by 
the original WL algorithm. In fact, the second conclusion is 
true before the error saturates, after that it is no longer valid. 
On the other hand, the correlation between adjacent records 
in the histogram introduces a systematic error, which is not 
reduced by small F as is demonstrated in the present paper, 
therefore the third conclusion is not valid. 

It is interesting to emphasize that Landau et al. in Ref. 
41 suggest that ln(/ flnal ) cannot be chosen arbitrarily small or 
the modified ln[g(£)] will not differ from the unmodified one 
to within the number of digits in the double precision num- 
bers used in the simulation. If this happens, the algorithm no 
longer converges to the true value, and the program may run 
forever. If ln(/ fmal ) within the double precision range is too 
small, the calculation might take excessively long to finish. 

Summarizing, in this work, we have presented an ana- 
lytical proof of the origin of the error saturation in the WL 
algorithms and a method to avoid it. We have chosen, as a 
test laboratory, a discrete system, namely, the Ising model. 
However, the mathematical arguments of the source of the 
error for the WL algorithm seem to be more general and can 
be extended to all algorithms which consider a refinement 
parameter that change, according to the flatness condition of 
the energy histogram, with a law that decreases faster than 
1/f. In all these cases, a saturation of the error for the calcu- 
lation of the density of states can be guaranteed. In fact, due 
that the entropy S(E,t) can be expressed as a series in which 
F(t) is the kernel. In those algorithms where F k -F k _ x !m 
(with any value of m> 1), the resulting series converges to a 
finite value, then the error saturates. 

The simplest way to avoid the saturation in the error is to 
choose a refinement parameter that depends on time as 
F(t) = t~ p with 1 (the optimum choice is p= 1). In these 
cases the series becomes divergent and the calculated density 
of states approaches asymptotically to the exact values as 
ccf-p' 2 ' This choice results in a more accurate and more effi- 
cient algorithm than other methods. 

Finally, we have considered the extension of our algo- 
rithm to the AM'old way method. It was shown that the ad- 
equate refinement parameter is F=l/t, where t is the accu- 
mulate average lifetime. This definition ensures that the DOS 
approaches asymptotically the exact value more efficiently 
than using the A-fold way original WL algorithm which satu- 
rates at long times. 
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